Automated Test: authz-service-improve-caching-pr #341

admin-coderabbit · 2026-02-04T19:47:44Z

This pull request was automatically created by @coderabbitai/e2e-reviewer.

Batch created pull request.

Summary by CodeRabbit

Refactor
- Restructured authorization caching to implement denial-specific caching and streamlined identity permission retrieval, reducing database queries on repeated authorization checks.
Tests
- Expanded test coverage for cache behavior across authorization checks and list operations, including cache-hit, cache-miss, and expiration scenarios.

* remove the use of client side cache for in-proc authz client
  Co-authored-by: Gabriel MABILLE <gabriel.mabille@grafana.com>
* add a permission denial cache, fetch perms if not in either of the caches
  Co-authored-by: Gabriel MABILLE <gabriel.mabille@grafana.com>
* Clean up tests
  Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>
* Cache tests
  Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>
* Add test to list + cache
  Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>
* Add outdated cache test
  Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>
* Re-organize metrics
  Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>

---------

Co-authored-by: Gabriel MABILLE <gabriel.mabille@grafana.com>

coderabbit-eval · 2026-02-04T19:49:24Z

📝 Walkthrough

The changes introduce a denial caching mechanism to the RBAC authorization service. A new NoopCache type is added for in-process paths, the Service gains a permDenialCache field, and permission checks now leverage cached identity permissions with explicit denial entries to reduce database calls.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Cache Infrastructure: `pkg/services/authz/rbac.go`, `pkg/services/authz/rbac/cache.go` | Introduces NoopCache type implementing the cache interface for in-process channels and adds userPermDenialCacheKey function to generate denial-specific cache keys. Refactors RBAC client initialization to use authzlib.NewClient with appropriate cache strategies (NoopCache for in-proc, LocalCache for remote). |
| Service Implementation: `pkg/services/authz/rbac/service.go` | Adds permDenialCache field to Service and wires it into initialization. Introduces getCachedIdentityPermissions helper function to centralize identity-based permission caching. Refactors Check and List methods to check denial cache first and fall back to cached identity permissions, reducing database calls on cache hits. |
| Tests: `pkg/services/authz/rbac/service_test.go` | Removes legacy cache test case from TestService_getUserPermissions. Adds TestService_CacheCheck and TestService_CacheList to validate caching behavior including cache hits, misses, and explicit denial entries. Updates setupService to initialize permDenialCache. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Service
    participant DenialCache as Denial Cache
    participant IdentCache as Identity Cache
    participant Database

    Client->>Service: Check(permission)
    Service->>DenialCache: Get denial entry
    alt Denial cached
        DenialCache-->>Service: Denied (cached)
        Service-->>Client: Allowed: false
    else No denial cached
        Service->>IdentCache: Get cached permissions
        alt Cached permissions hit
            IdentCache-->>Service: Permissions (cached)
            Service->>Service: Evaluate permission
            alt Permission allowed
                Service-->>Client: Allowed: true
            else Permission denied
                Service->>DenialCache: Set denial entry
                Service-->>Client: Allowed: false
            end
        else Cache miss
            Service->>Database: Fetch identity permissions
            Database-->>Service: Permissions
            Service->>IdentCache: Store permissions
            Service->>Service: Evaluate permission
            alt Permission allowed
                Service-->>Client: Allowed: true
            else Permission denied
                Service->>DenialCache: Set denial entry
                Service-->>Client: Allowed: false
            end
        end
    end
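
The flow in the diagram can be sketched in a few lines of Go. This is a minimal illustration, not the service's code: `checker`, `newChecker`, `fetch`, and the key formats are hypothetical, expiry is omitted, and plain maps stand in for the real caches. It follows the diagram as drawn; the diff reviewed further down appears to fall through to the database when cached permissions do not grant access, which this sketch does not model.

```go
package main

import "fmt"

// checker is a hypothetical stand-in for the RBAC service in the diagram.
type checker struct {
	denials map[string]struct{}                       // explicit denial entries
	perms   map[string]map[string]bool                // identity permissions per user+action
	fetch   func(user, action string) map[string]bool // stands in for the database
	dbCalls int                                       // counts fetches, for illustration
}

func newChecker(fetch func(user, action string) map[string]bool) *checker {
	return &checker{
		denials: map[string]struct{}{},
		perms:   map[string]map[string]bool{},
		fetch:   fetch,
	}
}

// Check consults the denial cache, then cached permissions, then the database.
func (c *checker) Check(user, action, scope string) bool {
	denialKey := user + "/" + action + "/" + scope
	if _, denied := c.denials[denialKey]; denied {
		return false // cached denial: no permission lookup at all
	}

	permKey := user + "/" + action
	scopes, hit := c.perms[permKey]
	if !hit {
		c.dbCalls++
		scopes = c.fetch(user, action)
		c.perms[permKey] = scopes
	}

	if scopes[scope] {
		return true
	}
	c.denials[denialKey] = struct{}{} // remember the denial for next time
	return false
}

func main() {
	c := newChecker(func(user, action string) map[string]bool {
		return map[string]bool{"dashboards:uid:dash1": true}
	})
	fmt.Println(c.Check("u1", "dashboards:read", "dashboards:uid:dash1")) // database fetch
	fmt.Println(c.Check("u1", "dashboards:read", "dashboards:uid:dash2")) // cached permissions
	fmt.Println(c.Check("u1", "dashboards:read", "dashboards:uid:dash2")) // denial cache
	fmt.Println("db calls:", c.dbCalls)
}
```

Only the first call reaches `fetch`; the second is answered from cached permissions and the third from the denial entry.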

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 A denial cache now guards the gate,
With speedy shortcuts, no more wait!
Identity permissions cached so tight,
The service dances left and right,
Permission checks now take their flight! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'Automated Test: authz-service-improve-caching-pr' is overly vague and generic. It reads as a test identifier rather than a descriptive summary of the actual changes (adding denial caching, improving identity permission caching in the RBAC authorization service). | Use a more descriptive title that captures the main change, such as 'Improve RBAC authorization caching with denial cache and identity permission optimization' or 'Add denial caching and optimize identity permission caching in RBAC service'. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbit eval help to get the list of available commands and usage tips.

coderabbit-eval

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@pkg/services/authz/rbac/service.go`:
- Around line 116-137: The permission cache metric is being recorded as a miss
even when getCachedIdentityPermissions(...) returns a cache hit but
checkPermission(...) yields allowed=false; update the logic around
permDenialKey, getCachedIdentityPermissions, and checkPermission so that
metrics.permissionCacheUsage reflects a cache hit in the err==nil branch (e.g.,
call s.metrics.permissionCacheUsage.WithLabelValues("true",
checkReq.Action).Inc() when cachedPerms were found regardless of allowed result,
or add a distinct label like "hit_denied") and only record "false" for real
cache misses outside that branch.

🧹 Nitpick comments (5)

pkg/services/authz/rbac/cache.go (1)
30-32: Cache key collision risk with _ separator.

The denial cache key shares the same prefix format as userPermCacheKey (namespace + ".perm_" + userUID + "_" + action), with name and parent appended using the same _ delimiter. If any field value ever contains _, keys could collide—both across the two key types and within denial keys themselves (e.g., name="a_b", parent="c" vs name="a", parent="b_c").

Current resource names/UIDs likely don't contain _, so the practical risk is low, but consider using a more robust separator (e.g., / or a multi-char delimiter) to make the key scheme collision-proof.
💡 Possible fix
 func userPermDenialCacheKey(namespace, userUID, action, name, parent string) string {
-	return namespace + ".perm_" + userUID + "_" + action + "_" + name + "_" + parent
+	return namespace + ".perm_denial_" + userUID + "/" + action + "/" + name + "/" + parent
 }
Using a distinct prefix (perm_denial_ vs perm_) and a different separator (/) eliminates any ambiguity with userPermCacheKey.
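
To make the collision concrete, here is a small self-contained sketch. `underscoreKey` reproduces the format quoted in the comment and `slashKey` the suggested replacement; both function names and the sample values are illustrative only.

```go
package main

import "fmt"

// underscoreKey reproduces the denial key format quoted in the review comment.
func underscoreKey(namespace, userUID, action, name, parent string) string {
	return namespace + ".perm_" + userUID + "_" + action + "_" + name + "_" + parent
}

// slashKey follows the suggested fix: a distinct prefix and "/" as separator.
func slashKey(namespace, userUID, action, name, parent string) string {
	return namespace + ".perm_denial_" + userUID + "/" + action + "/" + name + "/" + parent
}

func main() {
	// Two different (name, parent) pairs from the review's example.
	a := underscoreKey("org-12", "test-uid", "dashboards:read", "a_b", "c")
	b := underscoreKey("org-12", "test-uid", "dashboards:read", "a", "b_c")
	fmt.Println("underscore keys collide:", a == b)

	c := slashKey("org-12", "test-uid", "dashboards:read", "a_b", "c")
	d := slashKey("org-12", "test-uid", "dashboards:read", "a", "b_c")
	fmt.Println("slash keys collide:", c == d)
}
```

Note that "/" only moves the problem: two fields that themselves contain "/" could still produce the same key, so a fully collision-proof scheme would need escaping or length-prefixed fields.
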
pkg/services/authz/rbac.go (1)
239-251: Receiver name lc is misleading for NoopCache.

The receiver is named lc (likely copied from a LocalCache pattern), but this is a NoopCache. Consider renaming to nc or c for consistency.
💡 Proposed fix
-func (lc *NoopCache) Get(ctx context.Context, key string) ([]byte, error) {
+func (nc *NoopCache) Get(ctx context.Context, key string) ([]byte, error) {
 	return nil, cache.ErrNotFound
 }
 
-func (lc *NoopCache) Set(ctx context.Context, key string, data []byte, exp time.Duration) error {
+func (nc *NoopCache) Set(ctx context.Context, key string, data []byte, exp time.Duration) error {
 	return nil
 }
 
-func (lc *NoopCache) Delete(ctx context.Context, key string) error {
+func (nc *NoopCache) Delete(ctx context.Context, key string) error {
 	return nil
 }
pkg/services/authz/rbac/service.go (2)

342-368: getCachedIdentityPermissions performs a DB call for user identifiers, partially negating the "cache-only" intent.

For TypeUser/TypeServiceAccount, GetUserIdentifiers (line 356) may hit the database if the identity isn't in idCache. This means the "fast cached path" can still trigger a DB round-trip just to construct the perm-cache key, even if the perm cache itself turns out to be empty. The idCache has a longer TTL (2 min) so this is mostly mitigated after the first request, but on a cold cache or after TTL expiry, every Check/List call pays this cost upfront.

If this is intentional, a brief code comment explaining the tradeoff would help future maintainers.

153-155: Denial cache entries may outlive the permission cache, creating a brief stale-denial window.

Both permCache and permDenialCache use shortCacheTTL (30s), but the denial entry is written after the perm entry (which is set inside getUserPermissions). When a permission is granted in the DB shortly after a denial is cached, the denial entry can outlive the perm entry by the delta between the two writes, causing the fast-path at line 117 to return a stale denial even after permCache refreshes.

In practice this delta is small (milliseconds), and the 30s TTL makes any staleness transient. However, it's worth documenting this known tradeoff (or setting the denial TTL slightly shorter than the perm TTL) so operators understand the behavior during permission changes.
pkg/services/authz/rbac/service_test.go (1)
973-995: Test doesn't conclusively prove the denial cache short-circuits.

The permCache entry has "dashboards:uid:dash1": false, which would also result in a denial through the normal permission-check path. If the denial cache were removed, the test would still pass (via the perm check + DB fallback path). To truly verify that the denial cache takes priority, set a permCache value that would allow the request:
-		s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": false})
+		s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": true})
This way, if the denial cache is bypassed, checkPermission would return allowed=true, and the test would fail—proving the denial cache is the decisive factor.

coderabbit-eval · 2026-02-04T19:59:04Z

pkg/services/authz/rbac/service.go

+	permDenialKey := userPermDenialCacheKey(checkReq.Namespace.Value, checkReq.UserUID, checkReq.Action, checkReq.Name, checkReq.ParentFolder)
+	if _, ok := s.permDenialCache.Get(ctx, permDenialKey); ok {
+		s.metrics.permissionCacheUsage.WithLabelValues("true", checkReq.Action).Inc()
+		s.metrics.requestCount.WithLabelValues("false", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
+		return &authzv1.CheckResponse{Allowed: false}, nil
+	}
+
+	cachedPerms, err := s.getCachedIdentityPermissions(ctx, checkReq.Namespace, checkReq.IdentityType, checkReq.UserUID, checkReq.Action)
+	if err == nil {
+		allowed, err := s.checkPermission(ctx, cachedPerms, checkReq)
+		if err != nil {
+			ctxLogger.Error("could not check permission", "error", err)
+			s.metrics.requestCount.WithLabelValues("true", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
+			return deny, err
+		}
+		if allowed {
+			s.metrics.permissionCacheUsage.WithLabelValues("true", checkReq.Action).Inc()
+			s.metrics.requestCount.WithLabelValues("false", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
+			return &authzv1.CheckResponse{Allowed: allowed}, nil
+		}
+	}
+	s.metrics.permissionCacheUsage.WithLabelValues("false", checkReq.Action).Inc()


⚠️ Potential issue | 🟡 Minor

Misleading cache-usage metric when cache hits but check denies.

When getCachedIdentityPermissions returns cached permissions (cache hit) but checkPermission returns allowed=false, execution falls through to line 137 which unconditionally records a cache miss ("false"). The cache was found and consulted; the data simply didn't produce an allow. This inflates the cache-miss rate and deflates the cache-hit rate in monitoring, making it harder to assess actual cache effectiveness.

Consider recording the metric inside the if err == nil branch to reflect that the cache was hit, or use a distinct label (e.g., "hit_denied") to differentiate this path from a true cache miss.

💡 Proposed fix

 	cachedPerms, err := s.getCachedIdentityPermissions(ctx, checkReq.Namespace, checkReq.IdentityType, checkReq.UserUID, checkReq.Action)
 	if err == nil {
 		allowed, err := s.checkPermission(ctx, cachedPerms, checkReq)
 		if err != nil {
 			ctxLogger.Error("could not check permission", "error", err)
 			s.metrics.requestCount.WithLabelValues("true", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
 			return deny, err
 		}
 		if allowed {
 			s.metrics.permissionCacheUsage.WithLabelValues("true", checkReq.Action).Inc()
 			s.metrics.requestCount.WithLabelValues("false", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
 			return &authzv1.CheckResponse{Allowed: allowed}, nil
 		}
+		// Cache was present but didn't grant access: still a cache hit (just not conclusive for allow)
+		s.metrics.permissionCacheUsage.WithLabelValues("true", checkReq.Action).Inc()
+	} else {
+		s.metrics.permissionCacheUsage.WithLabelValues("false", checkReq.Action).Inc()
 	}
-	s.metrics.permissionCacheUsage.WithLabelValues("false", checkReq.Action).Inc()
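
The alternative mentioned above, a distinct label for "cache hit, but denied", could look roughly like this. It is a sketch under assumptions: `cacheUsageLabel` is a hypothetical helper, the "hit_denied" value is only the example name floated in the comment, and a plain map stands in for the real counter.

```go
package main

import "fmt"

// cacheUsageLabel picks the label value for the cache-usage counter.
// "hit_denied" is the hypothetical third label suggested in the review.
func cacheUsageLabel(cacheHit, allowed bool) string {
	switch {
	case !cacheHit:
		return "false" // real cache miss
	case allowed:
		return "true" // cache hit that granted access
	default:
		return "hit_denied" // cache hit that did not grant access
	}
}

func main() {
	// A plain map stands in for the metrics counter.
	counts := map[string]int{}
	observations := []struct{ hit, allowed bool }{
		{true, true},
		{true, false},
		{false, false},
	}
	for _, o := range observations {
		counts[cacheUsageLabel(o.hit, o.allowed)]++
	}
	fmt.Println(counts["true"], counts["hit_denied"], counts["false"])
}
```

A third label value keeps the hit rate honest without hiding how often cached permissions fail to settle a request, at the cost of one more series per action.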

coderabbit-eval bot closed this Feb 4, 2026

coderabbit-eval bot reviewed Feb 4, 2026
